Sampling Bias in BitTorrent Measurements
Abstract
Real-world measurements play an important role in understanding the characteristics of BitTorrent, currently a popular Internet application, and in improving its operation. As with measuring the Internet itself, the complexity and scale of the BitTorrent network make a single, complete measurement impractical. Although many measurement studies have employed diverse sampling techniques to study parts of the BitTorrent network, none has so far investigated their sampling bias, that is, their ability to objectively represent the characteristics of BitTorrent. In this work we present the first study of sampling bias in BitTorrent measurements. We first introduce a novel taxonomy of the sources of sampling bias in BitTorrent measurements. We then investigate the sampling bias among fifteen long-term BitTorrent measurements completed between 2004 and 2009, and find that different data sources and measurement techniques can lead to significantly different measurement results. Finally, we formulate three recommendations to improve the design of future BitTorrent measurements, and estimate the cost of applying these recommendations in practice.
Similar resources
On Assessing Measurement Accuracy in BitTorrent Peer-to-Peer File-Sharing Networks
The BitTorrent peer-to-peer file-sharing network is currently one of the dominant Internet applications. Understanding the characteristics of BitTorrent through real-world measurements is key to improve the quality of service for tens of millions of BitTorrent users, but the complexity and scale of BitTorrent make a single, complete measurement impractical. Thus, an increasing number of real me...
An Adaptive Trust Sampling Method for P2P Traffic Inspection
This paper focuses on the sampling-based Deep Packet Inspection for the traffic of P2P file sharing systems, especially for BitTorrent, and proposes a logarithmic-based Adaptive Trust Sampling (ATS) strategy for P2P traffic identification. In the whole process of sampling identification for P2P traffic, the sampling ratio of the current node in a P2P network can automatically adjust and dynamic...
Latency-driven BitTorrent
In recent years BitTorrent has become a notorious contributor to Internet traffic. Not only is BitTorrent responsible for over one third of all Internet traffic, but an immoderate amount of it is expensive cross-ISP or even inter-continental traffic. Much of BitTorrent’s long-distance traffic is due to its random selection of peers, which can cause connected peers to be at very different locati...
Is BitTorrent Unstoppable?
Anti-P2P companies have begun to launch Internet attacks against BitTorrent swarms. We use passive and active Internet measurements to study how successful these attacks are at curtailing the distribution of targeted content. For our active measurements, we develop a crawler that contacts all the peers in any given torrent, determines whether leechers in the torrent are under attack, and identi...